How to Evaluate Dimensionality Reduction? - Improving the Co-ranking Matrix
Abstract
The growing number of dimensionality reduction (DR) methods available for data visualization has recently inspired the development of quality assessment measures that evaluate the resulting low-dimensional representation independently of a method's inherent criteria. Several existing quality measures can be reformulated in terms of the so-called co-ranking matrix, which subsumes all rank errors (i.e., differences between the ranking of distances from every point to all others, comparing the low-dimensional representation to the original data). These measures are often based on partitioning the co-ranking matrix into four submatrices, divided at the K-th row and K-th column, and computing a weighted combination of the sums of each submatrix. Hence, the evaluation process typically involves plotting a graph over several (or even all possible) settings of the parameter K. Considering simple artificial examples, we argue that this parameter controls two notions at once that need not necessarily be combined, and that the rectangular shape of the submatrices is disadvantageous for an intuitive interpretation of the parameter. We contend that quality measures, as general and flexible evaluation tools, should have parameters with a direct and intuitive interpretation as to which specific error types are tolerated or penalized. Therefore, we propose to replace the parameter K with two distinct parameters that control these notions separately, and we introduce a differently shaped weighting scheme on the co-ranking matrix. The two new parameters can then be interpreted directly as, respectively, a threshold up to which rank errors are tolerated and a threshold up to which the rank distances are significant for the quality evaluation. Moreover, we propose a color representation of local quality to visually support the evaluation process for a given mapping, where every point in the mapping is colored according to its local contribution to the overall quality value.
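To make the co-ranking matrix and the role of K more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes Euclidean distances and NumPy; the function names `rank_matrix`, `coranking_matrix`, and `q_nx` are hypothetical. The measure shown is the commonly used fraction of preserved K-ary neighborhoods (the sum of the upper-left K-by-K block, normalized by K times the number of points), given as one example of a K-partitioned measure. It is not the two-parameter scheme proposed in the abstract, whose exact weighting is not stated here.

```python
import numpy as np

def rank_matrix(X):
    """Rank of every point j in the distance-sorted neighbor list of point i.

    Assumes Euclidean distances. The point itself gets rank 0, its nearest
    neighbor rank 1, and so on.
    """
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Force each point to sort first in its own row, even with duplicates.
    np.fill_diagonal(D, -1.0)
    order = np.argsort(D, axis=1, kind="stable")
    # The inverse of each row permutation gives the ranks.
    return np.argsort(order, axis=1)

def coranking_matrix(X_high, X_low):
    """Q[k-1, l-1] counts pairs with rank k in the original data and rank l
    in the low-dimensional representation."""
    n = X_high.shape[0]
    R_high = rank_matrix(X_high)
    R_low = rank_matrix(X_low)
    off_diag = ~np.eye(n, dtype=bool)  # skip each point's rank to itself
    Q = np.zeros((n - 1, n - 1), dtype=int)
    np.add.at(Q, (R_high[off_diag] - 1, R_low[off_diag] - 1), 1)
    return Q

def q_nx(Q, K):
    """Sum of the upper-left K-by-K block of Q, normalized to [0, 1]."""
    n = Q.shape[0] + 1
    return float(Q[:K, :K].sum()) / (K * n)

# Usage: random 5-D data "embedded" by keeping the first two coordinates.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Q = coranking_matrix(X, X[:, :2])
curve = [q_nx(Q, K) for K in range(1, Q.shape[0] + 1)]
```

Plotting `curve` against K gives the kind of graph the abstract refers to. Off-diagonal mass in `Q` corresponds to rank errors, and a perfect rank-preserving mapping yields a purely diagonal matrix.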
Related articles
Rank-based quality assessment of nonlinear dimensionality reduction
Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have been proposed in recent years, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. Many quality criteria actually rely on the a...
Quality assessment of nonlinear dimensionality reduction based on K-ary neighborhoods
Nonlinear dimensionality reduction aims at providing low-dimensional representations of high-dimensional data sets. Many new methods have been recently proposed, but the question of their assessment and comparison remains open. This paper reviews some of the existing quality measures that are based on distance ranking and K-ary neighborhoods. In this context, the comparison of the ranks in the hi...
Dimensionality Reduction and Improving the Performance of Automatic Modulation Classification using Genetic Programming (RESEARCH NOTE)
This paper shows how genetic programming can be used to select suitable features for automatic modulation recognition. Automatic modulation recognition is one of the essential components of modern receivers. In this regard, selection of suitable features may significantly affect the performance of the process. Simulations were conducted with 5 dB and 10 dB SNRs. Test and ...
Manifold Ranking using Hessian Energy
In recent years, learning on manifolds has attracted much attention in the academic community. The idea that the distribution of real-life data forms a low-dimensional manifold embedded in the ambient space works quite well in practice, with applications such as ranking, dimensionality reduction, semi-supervised learning and clustering. This paper focuses on ranking on manifolds. Traditional ma...
2D Dimensionality Reduction Methods without Loss
In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques have been applied in a lossless dimensionality reduction framework for a face recognition application. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...
Journal:
- CoRR
Volume: abs/1110.3917
Publication date: 2011